SEMI-AUTOMATIC GENERATION OF A CORPUS OF WIKIPEDIA ARTICLES ON SCIENCE AND TECHNOLOGY
(Generación semi-automática de un corpus de artículos de Wikipedia sobre ciencia y tecnología)

Authors

  • Julià Minguillón
  • Maura Lerga
  • Eduard Aibar
  • Josep Lladós-Masllorens
  • Antoni Meseguer-Artola

Abstract

Despite the huge amount of scientific and technological content available on the World Wide Web, most of it is either locked behind paywalls, as with academic journals, or almost invisible, as with institutional repositories. Wikipedia can act as a chain-transfer agent, providing people with an accessible, organized structure that contains both understandable content and links to original sources. Wikipedia categories are created collaboratively, so they form a folksonomy rather than a true taxonomy. Consequently, categories are not a reliable tool for identifying how topics are organized. In this paper we describe a semi-automatic method, based on random walks, for determining a subset of pages containing scientific and technological content in the Spanish Wikipedia. Using the UNESCO taxonomy, we determined the underlying graph structure of our corpus and detected clusters of strongly linked pages, establishing relationships between knowledge domains. Finally, we present the distribution of Wikipedia articles according to the UNESCO taxonomy and the resulting map of scientific and technological content.
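
The abstract does not give the details of the random-walk procedure, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes a page-link graph stored as a dictionary, a hand-picked list of seed pages, and a visit-count threshold for proposing candidate pages. The function name `random_walk_visits`, the toy graph, and all parameter values are hypothetical.

```python
import random
from collections import Counter


def random_walk_visits(links, seeds, n_walks=200, walk_length=10, rng_seed=0):
    """Count how often each page is visited by short random walks from seed pages.

    links: dict mapping a page title to the list of pages it links to.
    seeds: list of page titles assumed to be on-topic (chosen by hand).
    """
    rng = random.Random(rng_seed)  # fixed seed so the sketch is reproducible
    visits = Counter()
    for _ in range(n_walks):
        page = rng.choice(seeds)
        for _ in range(walk_length):
            visits[page] += 1
            neighbours = links.get(page, [])
            if not neighbours:
                break  # dead end: stop this walk early
            page = rng.choice(neighbours)
    return visits


# Toy, made-up link graph (page -> pages it links to); not data from the paper.
toy_links = {
    "Física": ["Energía", "Química"],
    "Química": ["Física", "Energía"],
    "Energía": ["Física"],
    "Fútbol": ["Deporte"],
    "Deporte": ["Fútbol"],
}

visits = random_walk_visits(toy_links, seeds=["Física"])

# Pages visited often enough become candidates for manual review
# (the threshold of 5 is an arbitrary, illustrative choice).
candidates = {page for page, count in visits.items() if count >= 5}
```

In this toy graph the walks never leave the cluster of pages linked from the seed, so the unrelated pages receive no visits. In a semi-automatic setting, the candidate set would presumably still be reviewed by hand before being added to the corpus.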




Publication date: 2017